Cluster Identification in Metagenomics – A Novel Technique of Dimensionality Reduction through Autoencoders
Abstract
Analysis of metagenomic data is challenging not only because the data are acquired from samples in their natural habitats, but also because of their high volume and high dimensionality. The fact that no prior lab-based cultivation is carried out in metagenomics makes inference on the presence of numerous microorganisms all the more challenging, accentuating the need for an informative visualization of this data. In a successful visualization, congruent reads of the sequences should appear in clusters depending on the diversity and taxonomy of the microorganisms in the sequenced sample. Metagenomic data represented by oligonucleotide frequency vectors are inherently high dimensional and therefore impossible to visualize as is. This raises the need for a dimensionality reduction technique to convert these higher dimensional sequence data into lower dimensional data for visualization purposes. In this process, preservation of the genomic characteristics must be given the highest priority. Currently, for dimensionality reduction purposes in metagenomics, Principal Component Analysis (PCA), which is a linear technique, and t-distributed Stochastic Neighbor Embedding (t-SNE), a non-linear technique, are widely used. Despite their wide use, these techniques are not exceptionally suited to the domain of metagenomics and have certain shortcomings and weaknesses. Our research explores the possibility of using autoencoders, a deep learning technique that has the potential to overcome the prevailing impediments of the existing dimensionality reduction techniques, eventually leading to richer visualizations.
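To make the representation mentioned in the abstract concrete, the following is a minimal sketch of how a single read might be turned into an oligonucleotide frequency vector. The function name `kmer_frequency_vector`, the choice of k = 4 (tetranucleotides, giving 4^4 = 256 dimensions), and the decision to skip windows containing ambiguous bases are illustrative assumptions; the abstract does not state which oligonucleotide length or preprocessing the authors used.

```python
from itertools import product


def kmer_frequency_vector(read, k=4):
    """Return the normalized k-mer frequency vector of a DNA read.

    Hypothetical helper for illustration only; k=4 is an assumption,
    not a value taken from the paper.
    """
    alphabet = "ACGT"
    # Enumerate all 4**k possible k-mers in a fixed lexicographic order.
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}

    counts = [0] * len(kmers)
    read = read.upper()
    for i in range(len(read) - k + 1):
        j = index.get(read[i:i + k])
        if j is not None:  # skip windows with ambiguous bases such as N
            counts[j] += 1

    total = sum(counts)
    if total == 0:
        return [0.0] * len(kmers)
    # Normalize so that reads of different lengths are comparable.
    return [c / total for c in counts]


vector = kmer_frequency_vector("ACGTACGTNACGT", k=4)
print(len(vector))  # one dimension per possible tetranucleotide
```

Each read becomes one point in a 256-dimensional space under these assumptions, which is why a reduction step such as PCA, t-SNE, or an autoencoder is needed before the reads can be plotted in two or three dimensions.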
Similar resources
A novel dimensionality reduction technique based on kernel optimization through graph embedding
In this paper, we propose a new method for kernel optimization in kernel based dimensionality reduction techniques such as Kernel Principal Components Analysis (KPCA) and Kernel Discriminant Analysis (KDA). The main idea is to use the graph embedding framework for these techniques and, therefore, by formulating a new minimization problem to simultaneously optimize the kernel parameters and the ...
Deep Autoencoders for Dimensionality Reduction of High-Content Screening Data
High-content screening uses large collections of unlabeled cell image data to reason about genetics or cell biology. Two important tasks are to identify those cells which bear interesting phenotypes, and to identify sub-populations enriched for these phenotypes. This exploratory data analysis usually involves dimensionality reduction followed by clustering, in the hope that clusters represent a...
Dimensionality Reduction of Astronomical Spectroscopic Data using Autoencoders
Background. Recent major advances in the understanding of galaxies owe a great deal to highly successful galaxy surveys conducted with the Sloan Telescope. With massive amounts of data, modern techniques such as machine learning come in naturally for efficient handling. While many applications have focused on characterizing and classifying astronomical objects within a predefined area of interest, uns...
A Dimensionality Reduction Technique for Collaborative Filtering
Recommender systems make suggestions about products or services based on matching known or estimated preferences of users with properties of products or services (content-based), properties of other users considered to be similar (collaborative filtering), or some hybrid approach. Collaborative filtering is widely used in e-commerce. To generate accurate recommendations in collaborative filterin...
A Novel Effective Distributed Dimensionality Reduction Algorithm
Dimensionality reduction algorithms are extremely useful in various disciplines, especially related to data processing in high dimensional spaces. However, most algorithms proposed in the literature assume total knowledge of data usually residing in a centralized location. While this still suffices for several applications, there is an increasing need for management of vast data collections in ...
Journal
Journal title: International journal on advances in ICT for emerging regions
Year: 2021
ISSN: 1800-4156, 2550-2794
DOI: https://doi.org/10.4038/icter.v14i2.7224